
BMC Neurology

Springer Science and Business Media LLC

Preprints posted in the last 90 days, ranked by how well they match BMC Neurology's content profile, based on 12 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Validation of Gait Tasks in SynapTrack Mobile App for Cervical Spondylotic Myelopathy

Lewis, A.; Arkam, F.; Steel, B.; Chen, E.; Singh, P.; Yakdan, S.; Becker, I.; Guo, W.; Shahrabani, A.; Payne, P. R.; Ghogawala, Z.; Steinmetz, M. P.; Neuman, B.; Ray, W. Z.; Duncan, R.; Greenberg, J.

2026-05-29 surgery 10.64898/2026.05.27.26354225 medRxiv
Top 0.1%
20.1%

Background: Gait impairment is a central sign of cervical spondylotic myelopathy (CSM) that is typically evaluated through subjective patient-reported questionnaires or objective in-clinic measures. These systems require substantial resources to administer and are poorly suited for longitudinal monitoring; emerging smartphone applications, however, present an efficient alternative. We developed and assessed the validity of a data processing framework based on the SynapTrack smartphone application to assess gait function in individuals with CSM. Methods: Participants completed walking tasks that were recorded on both the SynapTrack app and a gold-standard gait mat. Acceleration data extracted from the smartphone by the app were filtered and processed to produce gait cycle features including velocity, step time, waveform features, and frequency-domain features. Standard gait features were compared across the two methods by correlation and Bland-Altman plots to assess validity. App-based gait features were then compared to the standard modified Japanese Orthopedic Association (mJOA) score to determine construct validity through correlation and ability to discriminate between individuals with CSM and healthy controls. Finally, intraclass correlation coefficients (ICCs) and coefficients of variation were used to measure test-retest reliability and standard variation across app features. Results: A total of 110 participants were included in this study, of whom 55 (50%) had CSM, 24 (22%) had peripheral neuropathy, and 31 (28%) were healthy controls. SynapTrack gait measures including velocity, step time, and double support showed strong validity as indicated through Bland-Altman plots and high correlation (>0.8) with mat features. In addition to the gait features, acceleration root mean square, acceleration crest, spectral entropy, and dominant frequency showed strong construct validity compared to the mJOA across correlation (0.2-0.54), trend test (p < 0.001), and AUROC (0.62-0.79) analyses. ICCs showed moderate test-retest reliability (0.52-0.67). Discussion: The proposed framework for processing gait data showed strong validity compared to the gold-standard mat and high construct validity compared to the mJOA, suggesting the utility of the SynapTrack app as an efficient alternative to existing methods. The confirmation of gait metrics related to CSM severity and the identification of relevant waveform and frequency-domain features present opportunities to use smartphone apps to develop ecologically valid, data-driven markers of CSM severity.
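For readers unfamiliar with the Bland-Altman comparison mentioned in this abstract, the following is a minimal sketch of the usual calculation: the mean difference between two methods (bias) and the 95% limits of agreement. The function name and the velocity values are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the pairwise differences (a - b); the limits of
    agreement are bias +/- 1.96 times the sample SD of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical gait velocities in m/s (illustrative only, not study data).
app_velocity = [1.02, 0.88, 1.15, 0.76, 0.95, 1.10]
mat_velocity = [1.00, 0.90, 1.12, 0.80, 0.93, 1.08]

bias, lower, upper = bland_altman(app_velocity, mat_velocity)
print(f"bias={bias:.3f} m/s, limits of agreement=({lower:.3f}, {upper:.3f})")
```

A bias near zero with narrow limits would indicate that the two methods can be used interchangeably for that feature; what counts as "narrow" is a clinical judgment rather than a statistical one.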

2
Effect of levodopa treatment on gait in older adults with mild parkinsonian signs

Pongmala, C.; Roytman, S.; van Emde Boas, M.; Vangel, R.; Rosano, C.; Bohnen, N.

2026-06-06 geriatric medicine 10.64898/2026.06.04.26354926 medRxiv
Top 0.1%
10.1%

Background: Slow walking in older adults with mild parkinsonian signs (MPS) is a complex, multifactorial phenomenon arising from the cumulative burden of subclinical age-associated pathologies. This decline reflects age-associated neuronal loss in the dopaminergic system. A recent study suggests that levodopa treatment may enhance gait parameters. The goal of this small pilot study is to explore the effect of levodopa treatment on slow walking gait in older adults with MPS. Methods: This study was a randomized, placebo-controlled clinical pilot trial. Slow-walking older adults without clinical evidence of Parkinson's disease (PD) were recruited and randomized into two groups (active treatment or placebo control). Participants in the active group were pre-treated with carbidopa for three days, followed by carbidopa-levodopa for seven days. Spatiotemporal gait parameters were evaluated at baseline and post-intervention. Results: Gait factor analysis identified three main factors explaining gait characteristics at baseline: gait efficiency, gait rhythmicity, and gait turning. No effect of treatment was observed in the placebo group (p=0.111, p=0.616), and no group difference was observed between the placebo and active groups at baseline (β=0.310, p=0.547), but a strong trend for a treatment-related increase was observed in the active treatment group (β=0.506, p=0.076). Conclusion: Our preliminary data suggest that sustained levodopa treatment (one week) in conjunction with carbidopa pre-treatment and concomitant carbidopa supplementation is feasible in slow-walking older adults with MPS. Moreover, the data indicate potential efficacy, showing improvements in cadence and step durations.

3
Efficacy of tDCS and EEG Neurofeedback, individually and combined, on Neuropathic Pain following spinal cord injury: Protocol for a Randomised Controlled Trial

Chowdhury, N.; Hesam Shariati, N.; Quide, Y.; Zahara, P.; Herbert, R.; Restrepo, S.; Chen, K.; McIntyre, A.; Newton-John, T.; Middleton, J.; Craig, A.; Jensen, M. P.; Butler, J.; Briggs, N.; McAuley, J.; Gustin, S. M.

2026-03-18 pain medicine 10.64898/2026.03.11.26347999 medRxiv
Top 0.1%
6.4%

Neuropathic pain (NP) affects approximately 60% of individuals with spinal cord injury (SCI). Existing pharmacological treatments provide only modest relief and are often limited by adverse effects, while non-pharmacological options show small effects at best. As such, there remains a need for accessible, mechanism-informed treatments for SCI-NP. This protocol describes a trial evaluating two promising home-based neuromodulatory interventions for SCI-NP - electroencephalography neurofeedback (EEG-NF) and transcranial direct current stimulation (tDCS) - tested both independently and when applied in combination. We will employ a partially double-blinded (i.e. 1 treatment blinded, the other not), 2x2 factorial randomised controlled trial. Adults with chronic SCI-NP (N=192) will be randomised to: (1) EEG-NF + active tDCS, (2) EEG-NF + sham tDCS, (3) active tDCS alone, or (4) sham tDCS alone, in addition to treatment as usual. Participants will complete 20 home-based sessions over 5 weeks. The primary outcome is change in overall pain severity with the primary endpoint being 6 weeks post-randomisation, with secondary endpoints at 16, 26 and 52 weeks post-randomisation. Secondary outcomes (worst pain intensity, pain interference, sleep, depressive symptoms, health-related quality of life) will be assessed at 6 weeks, 16 weeks, 26 weeks and 52 weeks post-randomisation. This will be the first large-scale trial of home-based EEG-NF and tDCS for SCI-NP. If found to be effective, these scalable interventions could be integrated into routine care and inform further optimisation of neuromodulation strategies for managing SCI-NP.

4
Variation in Haemostasis and VTE Prophylaxis in Elective Adult Cranial Neurosurgery: A Global Survey of Perioperative Practice

Pandit, A. S.; Chaudri, T.; Chaudri, Z.; Vasilica, A. M.; Dhaliwal, J.; Sayar, Z.; Cohen, H.; Westwood, J. P.; Toma, A. K.

2026-04-16 surgery 10.64898/2026.04.14.26350905 medRxiv
Top 0.1%
5.2%

Background: Venous thromboembolism (VTE) remains a major cause of perioperative morbidity in cranial neurosurgery, yet clinical practice varies widely, and formal guidelines are inconsistent. Understanding internationally sampled neurosurgical practice is essential for informing consensus and future trials. Methods: An international, 2-stage cross-sectional, internet-based survey was conducted. Practising neurosurgeons performing elective adult cranial surgery were eligible. Descriptive statistics were used to summarise practice. Responses covered patterns of pre-operative haemostasis decision making, use and timing of mechanical and/or chemical prophylaxis, use of perioperative imaging prior to anticoagulation, and frequency of clinical assessment for VTE. Associations with geographical income status, subspecialty, and years post-certification were statistically tested. Practice heterogeneity was quantified and contextual influence was summarised using mean effect sizes across stratifying variables in order to determine domains of true equipoise. Results: Of 585 responses, 456 (78%) met criteria for inclusion, representing 322 units across 78 countries (71% high-income). Thirteen per cent reported no departmental VTE plan; 23% followed no guidelines and 12% used multiple. Routine pre-operative testing almost universally included haemoglobin/platelets/haematocrit, with fibrinogen more common in high-income settings. Compared with high-income country respondents, low- and middle-income respondents reported higher haemoglobin transfusion thresholds (>90 g/L; p<0.001), shorter antiplatelet interruption (p≤0.03), and less frequent outpatient VTE assessment (p<0.001). Mechanical prophylaxis was common (TEDs 81%, IPC 62%), typically started pre- or intra-operatively. Among those completing the chemoprophylaxis section (n=310), 57% required a CT or MRI scan before LMWH, which was then initiated on average 31.4 hours after surgery. 1% of respondents did not routinely use LMWH. Many clinical decisions demonstrated statistical equipoise, i.e. high heterogeneity with low contextual influence. Conclusion: Peri-operative haemostasis and VTE prophylaxis practices in adult elective cranial neurosurgery vary substantially worldwide, with some decisions reflecting geographical or socioeconomic differences and many others reflecting true clinical equipoise rather than contextual determinants. By mapping contemporary real-world practice across diverse health-system contexts, this study provides a necessary empirical foundation for rational trial design and future guideline development.

5
Video-based Detection of Delirium in Hospitalized Adults

Mendu, M.; Tesh, R. A.; Pellerin, K.; Steward, G. E.; Cerda, I. H.; Williams, M.; Colman, M.; Shah, S.; Lam, A. D.; Cash, S. S.; Westover, M. B.; Kimchi, E. Y.

2026-05-13 geriatric medicine 10.64898/2026.05.11.26352902 medRxiv
Top 0.1%
5.0%

Delirium, a dynamic neuropsychiatric condition associated with morbidity and mortality, remains underdiagnosed due to reliance on subjective, intermittent screening tools. Objective and potentially continuous identification is needed to improve clinical care. We developed and validated an analytic framework for delirium classification based on automatically extracted video features. In this prospective cohort study, patients (≥18 years) admitted to the inpatient medical or neurological ward of a tertiary academic center between August 2020 and March 2022 with an expected stay longer than one night were enrolled. Daily structured delirium assessments and brief video recordings were performed in consenting patients. Videos were analyzed using deep learning pose estimation to extract keypoints and calculate behavioral features based on eye, face, and limb postures and movements. Four machine learning models (logistic regression, gradient boosting, support vector machines, and random forests) were trained to predict delirium status from extracted features. Model performance was evaluated on 20 repetitions of three-fold cross-validation using the area under the receiver operating characteristic curve (AUC ROC). The cohort included 109 videos from 25 male and 25 female participants (median age: 72, IQR: 63.25-78). Twenty videos (18%) were from patients with delirium. Keypoints for this dataset were more accurately extracted using a customized ResNet-101 model developed with DeepLabCut (sensitivity 0.94, specificity 0.89, compared to human-labeled gold standards) than using off-the-shelf models. Keypoints were then used to generate behavioral features summarizing movement and postures throughout the video. A support vector machine model achieved an average delirium classification AUC ROC of 0.79 (SD ± 0.09), sensitivity of 0.71 (SD ± 0.16), and specificity of 0.78 (SD ± 0.07). This study demonstrates the feasibility of identifying delirium using brief videos in clinically heterogeneous cohorts and reveals novel features for objective identification. Author Summary: Delirium is a sudden change in attention and awareness that commonly affects hospitalized patients. It is linked with longer hospital stays, cognitive decline, and death. Patients with delirium often show changes in movements and behaviors such as slowed movement, restlessness, or excessive scanning of the environment. Since current screening tools rely on intermittent human interactions, they can be subjective and miss the fluctuating nature of delirium, leading to underdiagnosis. We sought to explore whether short video recordings could be used to detect delirium automatically. In our study, we enrolled 50 hospitalized patients and conducted daily delirium assessments and video recordings. We used a machine learning model to analyze patients' eye movements, facial expressions, and body postures. We found that video-derived features could be used to identify delirium in a small clinical cohort. While needing further validation in outside cohorts, this study shows an important proof-of-concept for objective delirium monitoring in heterogeneous clinical contexts without adding burden to clinical staff.
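The evaluation scheme named in this abstract, 20 repetitions of three-fold cross-validation, can be sketched as an index generator. This is an illustration under stated assumptions rather than the authors' pipeline: the function name, the seed, and the plain per-video shuffling are all hypothetical, and the abstract does not say whether videos from the same participant were kept in the same fold.

```python
import random


def repeated_kfold(n_samples, n_folds=3, n_repeats=20, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Each repeat reshuffles the sample indices and deals them into
    n_folds disjoint test sets; the remaining indices form the training set.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices round-robin into the folds.
        folds = [order[i::n_folds] for i in range(n_folds)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, sorted(test_idx)


# 109 matches the number of videos reported in the abstract.
splits = list(repeated_kfold(109))
print(len(splits))  # 20 repeats x 3 folds = 60 train/test splits
```

Because the cohort has several videos per participant, a grouped split (all videos of one participant in the same fold) would be the more conservative design; the sketch above does not do that.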

6
Benchmarking General-Purpose and Medical AI Large Language Models for Clinical Assessment and Management in Parkinson's Disease

Shechter, Y.; Klevor, R.; Kouchache, T.; Bouhadoun, S.; Postuma, R. B.

2026-05-20 neurology 10.64898/2026.05.13.26353021 medRxiv
Top 0.1%
4.9%

Background: The clinical applicability of large language models (LLMs) in Parkinson's disease (PD) management remains insufficiently characterized, particularly in generative responses to clinical vignette scenarios. Objective: To evaluate the quality of clinical assessments and management plans generated by a general-purpose LLM (Gemini 1.5 Pro) and a medically specialized LLM (OpenEvidence), and to compare their performance. Methods: Models generated free-text responses to 45 open clinical queries, focused on assessment of the situation and the recommended management plan. Two movement disorders fellows rated outputs using 5-point Likert scales, dichotomized into clinically appropriate (≥4) versus inappropriate (≤3). Discrepancies were adjudicated by a senior movement disorders specialist. Paired comparisons used McNemar's test; qualitative analysis examined severe errors. Results: Gemini 1.5 Pro and OpenEvidence showed high rates of clinically appropriate assessments (80.0% vs. 86.7%) but lower performance in management plans (48.9% vs. 57.8%). Both assessment and plan were clinically appropriate in 46.7% and 55.6% of cases, respectively. None of these differences reached statistical significance. Severe errors were uncommon in assessments (6.7% vs. 8.9%) but more frequent in plans (26.7% in both), predominantly reflecting treatment strategy errors. Conclusions: In generative clinical reasoning tasks involving Parkinson's disease management vignettes, LLMs demonstrated reasonable performance in assessment, but consistent limitations in plan generation. The medically specialized LLM demonstrated several qualitative advantages but no statistically significant performance benefit over the general-purpose model. Therefore, these tools should be used with appropriate caution in Parkinson's disease management, particularly regarding treatment recommendations.
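The paired comparison named in this abstract, McNemar's test, depends only on the discordant pairs: vignettes where one model was rated appropriate and the other was not. Below is a minimal sketch of the exact (binomial) form. The abstract does not report the discordant counts or say which variant of the test was used, so the counts and the function name here are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: pairs where only the first rater/model was 'appropriate'
    c: pairs where only the second rater/model was 'appropriate'
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts (illustrative only, not from the study).
only_general_appropriate = 3
only_medical_appropriate = 6
p_value = mcnemar_exact(only_general_appropriate, only_medical_appropriate)
print(f"p = {p_value:.4f}")
```

With only 45 vignettes, the number of discordant pairs is necessarily small, which limits the power of this test to detect a difference between the two models.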

7
Feasibility and Tolerability of Deep Repetitive Transcranial Magnetic Stimulation for Mild Neurocognitive Disorder in Older Adults (Deep MIND): Study Protocol

Rajani, M. I.; Yaya, H.; Vandehei, E.; Di Passa, A.-M.; McIntyre-Wood, C.; Prokop-Millar, S.; Krzyzanowski, D.; Zhang, M.; Fein, A.; MacKillop, E.; De Jesus, J.; Frey, B.; MacKillop, J.; Duarte, D.

2026-06-02 geriatric medicine 10.64898/2026.05.30.26354496 medRxiv
Top 0.1%
4.9%

Background: Mild neurocognitive disorder (NCD) is a condition in which individuals experience mild cognitive decline but are independent in their activities of daily living. Due to the increasing number of people living with mild NCD and its negative impact on quality of life, it poses a significant health burden worldwide. Innovative approaches are therefore urgently needed to address the lack of effective treatment options. Deep transcranial magnetic stimulation (dTMS), a non-invasive neuromodulation technique approved for the treatment of various neuropsychiatric disorders, could serve as a novel intervention for mild NCD. It can stimulate deeper and broader areas of the brain implicated in mild NCD, such as the prefrontal cortex, insula, and anterior cingulate cortex. Objectives: This study will examine the feasibility and tolerability of the Health Canada and Food and Drug Administration (FDA) approved dTMS coils (H1, H4 and H7 coils) in individuals with mild NCD. Secondarily, it will assess the impact of dTMS on cognition, mood, sleep, anxiety, brain activity (via electroencephalography), and blood biomarkers of neurodegeneration and inflammation. Methods: This open-label pilot study will recruit N=30 participants aged 60-90 with mild NCD. Participants will be assigned to one of the three dTMS coil conditions (H1, H4 & H7) and will complete a total of 20 dTMS sessions over 6 weeks. Data will be collected before, during, immediately after, and one month following the intervention period. Discussion: This pilot study will generate necessary evidence regarding the feasibility and tolerability of dTMS in mild NCD. This will be used to determine whether a definitive trial is justified and to inform the trial procedures. In the long term, dTMS may address a critical gap in therapeutic options for mild NCD. Clinical Trial registration: The protocol was registered on Clinicaltrials.gov (CT07038798) on June 2nd, 2025.

8
Preliminary Reliability and Validity of SynapTrack, a Smartphone-Based Digital Biomarker Platform for Remote Assessment of Cervical Spondylotic Myelopathy

Yakdan, S.; Singh, P.; Arkam, F.; Chen, E.; Lewis, A.; Steel, B.; Becker, I.; Guo, W.; Naveed, H.; Wang, C.; Yang, D.; Wang, Z.; Ray, W. Z.; Hassenstab, J.; Steinmetz, M. P.; Ghogawala, Z.; Kelleher, C.; Greenberg, J.

2026-06-01 surgery 10.64898/2026.05.29.26354454 medRxiv
Top 0.1%
4.6%

Background and Objectives: Cervical spondylotic myelopathy (CSM) is a leading cause of neurological disability in older adults. However, validated, scalable tools to quantify disease severity and changes over time are lacking. Recent advances in smartphone technology have opened new avenues for longitudinal, objective, and remote monitoring of neurological conditions. We performed a preliminary evaluation of the reliability and validity of SynapTrack, a smartphone-based digital platform for objective remote CSM assessments. Methods: In this single-center prospective cohort study, 265 participants (151 with CSM, 114 healthy controls) completed in-person SynapTrack assessments related to tapping, pinching, and vibratory detection, along with reference laboratory measures of dexterity (Box and Block Test, 9-Hole Peg Test) and vibratory sensation (tuning fork). A subset completed repeated home-based testing to assess test-retest reliability. We evaluated convergent validity, construct validity against the modified Japanese Orthopedic Association (mJOA) score, known-groups validity, and test-retest reliability (intraclass correlation coefficient, ICC). Results: Smartphone-derived metrics demonstrated good-to-excellent test-retest reliability, with the strongest stability for vibratory detection threshold (ICC = 0.92), overall and non-dominant tapping speed (ICC = 0.90 each), and pinching successful targets (ICC = 0.90). Convergent validity was supported by moderate-to-strong correlations between digital metrics and reference laboratory dexterity tests (ρ up to 0.60 for tapping speed; up to -0.65 for the vibratory threshold). Construct validity against the mJOA was strongest for the vibratory threshold (ρ = -0.53 to -0.54) and Level 2 non-dominant pinching errors (ρ = -0.45). Selected metrics distinguished CSM patients from controls with good discrimination, including non-dominant tapping speed (AUROC = 0.76, 95% CI 0.68-0.85), Level 2 dominant pinching successful targets (AUROC = 0.78, 95% CI 0.62-0.94), and the non-dominant vibratory threshold (AUROC = 0.77, 95% CI 0.64-0.90). Conclusions and Relevance: A smartphone-based battery of upper-extremity sensorimotor tasks demonstrated preliminary reliability and validity in CSM. Furthermore, to our knowledge, the novel vibratory detection task represents the first smartphone-based sensory assessment used for CSM. Collectively, these findings position SynapTrack as a scalable platform for objective, remote neurological monitoring of CSM.
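The AUROC values quoted in this abstract have a direct interpretation: the probability that a randomly chosen patient has a more impaired score than a randomly chosen control, with ties counted as one half. The sketch below computes that quantity by pairwise comparison. The scores are hypothetical and oriented so that higher means more impaired; neither the data nor the function name comes from the study.

```python
def auroc(positive_scores, negative_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive score is higher, 0.5 when tied,
    and 0 otherwise (equivalent to the Mann-Whitney U statistic scaled
    by the number of pairs).
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores (higher = more impaired); illustrative only.
patient_scores = [3, 5, 6, 2]
control_scores = [1, 2, 3, 4]
print(auroc(patient_scores, control_scores))  # 0.75
```

For a metric where lower values indicate impairment, such as tapping speed, the scores would be negated (or the groups swapped) before applying the same calculation.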

9
The Case Against the 'S': Is Functional Neurological Disorder(s) One Condition or Many?

Palmer, D. D. G.; Edwards, M. J.; Mattingley, J.

2026-03-23 neurology 10.64898/2026.03.19.26348846 medRxiv
Top 0.1%
4.3%

Background: Functional neurological disorder (FND) is one of the most common, but least researched, conditions in neurology. Debate exists as to whether the clinical entity referred to as FND is truly a single disorder or is in fact multiple entities which have been erroneously amalgamated into the same condition. We sought to provide empirical evidence on this question by treating it as a problem of model comparison. Methods: We formulated statistical models equivalent to: (1) FND being a single entity with variation in phenotype, represented by latent trait (binary factor/item response theory) models, and (2) FND being multiple discrete entities, represented by latent class analysis (LCA) models. We fitted these models to data on the symptoms experienced by 697 people with FND from the FND Research Connect database (fnd-research.org) and used Bayesian model comparison methods to compare them. Results: All but one of the latent trait models, representing FND as a single entity with heterogeneous phenotype, fit the data better than all the LCA models. Secondary analysis of the LCA models showed results compatible with the models capturing discretisation of continuous variation rather than true discrete categories. Discussion: Our results suggest that the symptom structure of FND is the result of a single pathophysiological process, either as a single entity, or a common pathway preceded by multiple causative processes where the common pathway is solely responsible for the phenotype of the condition.

10
The MIND Study: Design, Feasibility, and Baseline Characteristics of a Smartphone-Based Migraine Cohort

Khorsand, B.; Teichrow, D.; Lipton, R. B.; Ezzati, A.

2026-04-21 neurology 10.64898/2026.04.14.26350866 medRxiv
Top 0.1%
4.2%

Objective: To describe the design, feasibility, and baseline characteristics of the Migraine Impact on Neurocognitive Dynamics (MIND) study, a 30-day smartphone-based cohort for high-frequency assessment of cognition and symptoms in adults with migraine. Background: Cognitive symptoms are an important component of migraine burden, but they are difficult to measure using single-visit testing or retrospective questionnaires. Repeated smartphone-based assessment may better capture real-world variability in cognition and symptoms. Methods: Adults meeting International Classification of Headache Disorders, 3rd edition, criteria for migraine were enrolled remotely and completed 30 days of once-daily ecological momentary assessments and mobile cognitive tasks delivered through the Mobile Monitoring of Cognitive Change platform. Baseline measures assessed demographics, migraine characteristics, disability, mood, stress, and treatment patterns. Feasibility was evaluated using enrollment, completion, and retention metrics. Results: A total of 177 participants enrolled (mean age 38.8 ± 11.9 years; 79.7% female), including 80/177 (45.2%) with chronic migraine. Across the 30-day protocol, 3688 daily assessments were completed, representing 70.8% of all possible study days, and 70.6% of participants completed at least 20 days of monitoring. Completion remained above 60% across study days. At baseline, chronic migraine was associated with greater burden than low-frequency and high-frequency episodic migraine, including higher MIDAS scores (98.6 vs. 38.7 and 70.3), more days with concentration difficulty (16.0 vs. 7.9 and 11.5), and more days with functional interference (18.5 vs. 7.6 and 13.0). Conclusions: The MIND study demonstrates the feasibility of high-frequency smartphone-based assessment of cognition and symptoms in migraine and provides a methodological foundation for future analyses of within-person cognitive and symptom dynamics across the migraine cycle.

11
Artificial intelligence-generated digital Romberg test for peripheral neuropathy monitoring.

Tejada-Illa, C.; Pi-Cervera, A.; Pegueroles, J.; Claramunt-Molet, M.; Heras-Delgado, A.; Gascon-Fontal, J.; Idelsohn-Zielonka, S.; Rico, M.; Vidal-Fernandez, N.; Martin-Aguilar, L.; Caballero-Avila, M.; Lleixa, C.; Collet-Vidiella, R.; Moreno, J.; Mederer-Fernandez, T.; Llanso, L.; Carbayo, A.; Vesperinas, A.; Querol, L.; Pascual-Goni, E.

2026-05-15 neurology 10.64898/2026.05.12.26353015 medRxiv
Top 0.1%
4.1%

Background and Objectives: Patients with peripheral neuropathies (PN) commonly exhibit balance impairment. In clinical practice, balance is typically assessed using the Romberg test and ataxia scales, which rely on examiner interpretation, while objective biomarkers for quantifying balance remain lacking. Wearable sensors are valuable tools for objectively quantifying gait abnormalities in PN patients and may capture clinically meaningful changes over time. By integrating these parameters, artificial intelligence (AI) can assist in generating a digital score that enables easy, objective, and reproducible monitoring of patients' postural balance. This study aims to generate and assess an AI-generated digital Romberg test to quantify balance impairments in a cohort of PN patients. Methods: PN patients were assessed in a longitudinal study using a wearable system composed of inertial sensors placed on the trunk and plantar pressure sensors integrated in insoles. Patients performed the Romberg test under both eyes-open and eyes-closed conditions and were classified according to ataxia severity (mild, moderate, or severe) following the score obtained in item 1 of the MICARS and SARA scales. Results: We included 97 patients with PN (including autoimmune and hereditary polyneuropathies) and 117 healthy controls (HC). Significant differences in trunk sway and center of pressure (COP) were observed between groups, particularly with eyes closed. Using wearable sensor parameters, we developed an AI digital Romberg test, which correlated with clinician-rated Romberg test performance and distinguished patients with and without ataxia (AUC=0.632) and across different PN pathologies. Longitudinally, the digital Romberg test and iRODS showed concordant trajectories. Also, changes ≥25% in the score were associated with clinical changes in ataxia severity measured by an increase in MICARS-SARA score (+1.42 points), whereas improvement was associated with a decrease (-0.20 points) in the scale. Discussion: This study demonstrates that wearable sensors are useful to detect and quantify balance impairment. The AI-generated Romberg test is an objective and reproducible tool for postural balance assessment, with robust discriminatory performance across clinical ataxia severity in PN. Longitudinal changes in the score aligned with clinical severity, supporting its potential for monitoring disease progression and treatment response. Its strong association with balance measures reinforces its role as a quantitative biomarker of postural control in ataxia patients.

12
Predicting long term clinical outcomes in Parkinson's Disease using short term rating scales

Burnell, M.; Gonzalez-Robles, C.; Zeissler, M.-L.; Bartlett, M.; Clarke, C. S.; Counsell, C.; Hu, M. T.; Foltynie, T.; Carroll, C.; Lawton, M.; Ben-Shlomo, Y.; Carpenter, J.

2026-03-30 neurology 10.64898/2026.03.27.26349548 medRxiv
Top 0.1%
3.6%

Background: Most trials of Parkinson's disease (PD) measure progression over a short to medium time-period using continuous rating scales that may be hard to interpret and less meaningful for patients. There is a lack of evidence connecting changes in these scales to changes in outcomes important to patients. Objectives: We present causal modelling to translate the causal, short-term disease-modifying treatment effects on functional rating scales to the 10-year risk of serious clinical progression milestones. Methods: We selected four important clinical milestones of disease progression from the Oxford Parkinson's Disease Centre "Discovery" cohort: dementia, any falls, frequent falls, and mortality. We proposed a causal framework for our research objectives so we could model the potential impact of a 30% reduction in disease progression slopes ("treatment effect") using the summation of parts I and II of the Movement Disorder Society Unified Parkinson's Disease Rating Scale (UPDRS12). This outcome was regressed on time to milestone using flexible parametric survival models. Marginal predictions of survival and survival difference at year 10 were then calculated for the Discovery cohort and for a counterfactual cohort applying the treatment effect, to estimate the relative and absolute reductions for the four clinical milestones. Results: The modelled increases in risk for each unit change in the UPDRS12 were as follows: dementia hazard ratio (HR)=1.52 (95% Confidence Interval (CI) 1.36-1.70), any falls HR=1.37 (95% CI 1.29-1.46), frequent falls HR=1.68 (95% CI 1.49-1.89), mortality HR=1.29 (95% CI 1.17-1.42). These models led to marginal predictions of absolute reductions, when the progression was reduced by 30%, between 4.0% (mortality) and 7.5% (frequent falls) at 10 years of follow-up. Conclusions: We have demonstrated how a treatment effect in a trial specified in terms of a progression change of a rating scale can be contextualised into a long-term reduction in the probability of clinically relevant milestones. Whilst we have used PD as our exemplar, we believe this methodological approach is generalisable to other chronic progressive diseases where trials are often limited to a relatively short follow-up period and use some scalar measure of progression, but significant clinical milestones usually take longer to be observed. Keywords: Clinical trials; disease modifying therapies; causal estimation; prediction models

13
Efficacy of virtual reality treatment of phantom leg pain: Results of a randomized clinical trial

Ambron, E.; Williamson, R.; Li, J.-S.; Karrenbach, M.; Rombokas, E.; Coslett, H. B.; Buxbaum, L. J.

2026-04-28 pain medicine 10.64898/2026.04.20.26350810 medRxiv
Top 0.1%
3.6%

Approximately 90% of individuals with limb amputation experience the persistent sensation of the missing extremity and up to 85% experience debilitating pain in the missing limb, a condition termed phantom limb pain (PLP). In this registered clinical trial (NCT05296265), we tested the efficacy of Virtual Reality (VR) treatment of phantom leg pain in a sample of transtibial and transfemoral amputees with PLP. Adaptive randomization was used to assign 36 participants (19 transfemoral, 17 transtibial) recruited across three study sites to eight sessions of an active or distractor VR treatment. The active VR treatment required leg movements and provided virtual visual feedback. The distractor treatment was a commercially available VR treatment for pain based on the principle of distraction. The primary outcome measures were ratings of pain intensity and quality, compared between baseline and immediately post-treatment and at 1-week and 8-week follow-up. The secondary outcome measure, obtained in each session, was average pain intensity since the last treatment. Pain on both intensity and quality measures was significantly reduced with moderate effect sizes for the active treatment only; intensity effects persisted at 1-week follow-up, and quality effects persisted at 8-week follow-up. Ratings of pain intensity since the last treatment showed a large effect size for the active treatment and were significant for both treatments. This clinical trial showed significant efficacy of VR treatment for PLP, particularly for an active treatment providing virtual visual feedback of the amputated limb.

14
Characteristic resting state facial expressions in older adults with mild cognitive impairment level

Miyayama, M.; Sekiguchi, T.; Sugimoto, H.; Kawagoe, T.; Tripanpitak, K.; Wolf, A.; Kumagai, K.; Fukumori, K.; Miura, K. W.; Okada, S.; Ishimaru, K.; Otake-Matsuura, M.

2026-04-11 geriatric medicine 10.64898/2026.04.10.26350581 medRxiv
Top 0.1%
3.4%

Background: For early detection of Alzheimer's disease, it is essential to identify individuals showing cognitive performance consistent with the mild cognitive impairment (MCI) range during preliminary screening, ideally using methods that extend beyond conventional cognitive assessments. Non-invasive, easily accessible screening tools applicable in daily life are increasingly needed. Facial expressions, particularly during rest, may offer promising biomarkers for MCI level detection. This study aimed to identify specific facial features associated with MCI level during rest to inform development of facial expression-based screening tools. Methods: Participants were classified into an MCI level group and a healthy control (HC) group based on Montreal Cognitive Assessment (MoCA) scores. Facial Action Units (AUs) were extracted from video recordings of resting-state facial expressions in 31 individuals with MCI level and 14 HC. Two statistical models were employed: a multilevel zero-inflated beta regression model for the intensity of 17 AUs and a multilevel logistic regression model for the presence or absence of 18 AUs. Results: In the zero-inflated beta regression, the AU related to the upper lip raiser showed a significant group effect (MCI level vs. HC; p < 0.001), which remained significant after multiple comparison correction. The logistic regression revealed significant group differences for the AUs related to lip tightener (p < 0.001) and lip suck (p < 0.001), both of which remained significant after multiple comparison correction. Conclusions: Distinctive facial action patterns during rest were observed in individuals with MCI level. These findings highlight the potential of resting-state facial expressions as a basis for novel, unobtrusive screening tools for early MCI level detection.

15
Characteristics of Highly Creative Surgeons (The INSPIRE Study): An International Mixed-Methods Study Protocol

Thabane, A.; McKechnie, T.; Staibano, P.; Scheau, C.; Dragosloveanu, S.; Guerra Farfan, E.; Sajol, R. R.; Arora, V.; Calic, G.; Parpia, S.; Busse, J. W.; Hamoudi, N.; Patel, D.; Reiter-Palmon, R.; Bhandari, M.

2026-05-19 surgery 10.64898/2026.05.15.26353308 medRxiv
Top 0.1%
3.2%

Introduction: Creativity is important in surgery for problem-solving in the operating room and for the development of surgical innovations that improve patient outcomes. However, our limited understanding of the characteristics and competencies of highly creative surgeons has inhibited our ability to develop the tools, programs and interventions necessary for cultivating the creativity of surgeons. We present the protocol for the INSPIRE Study, which aims to identify the factors associated with high creative achievement in surgeons. Methods and Analysis: We have designed a sequential mixed-methods study, comprising a cohort study accompanied by qualitative semi-structured interviews. The primary objective of this study will be to identify factors associated with high creative achievement in surgeons, to be assessed through direct involvement in innovation or invention, or a top score (10 out of 10) on any domain in the Inventory of Creative Activities and Achievements questionnaire. We plan to measure 39 different personal, domain-specific, domain-general, and environmental/motivational variables, chosen based on previous literature and on exploratory grounds, to be assessed as possible factors of creative potential. Multivariable logistic regression is planned, with high creative achievement as the dependent variable and all 39 potential factors of creative potential as independent variables. Ethics and Dissemination: Ethics approval from the Hamilton Integrated Research Ethics Board has been obtained and no harm is expected due to participation in this study. To facilitate knowledge translation, we plan to publish the feasibility data and results in peer-reviewed journals, and present at international surgical and creativity conferences.

16
Large Language Model Performance in UK Advice & Guidance: A Pilot Study in Neurology

Healy, J.; Marvasti, A.; Wallace, D.; Baheerathan, A.; Ghosh, A.; Kossoff, J.; Thio, S.; Balaratnam, M.; Haider, S.; Ellershaw, S.; Dobson, R.

2026-05-18 neurology 10.64898/2026.05.13.26353081 medRxiv
Top 0.2%
3.1%

Background: Large language models (LLMs) demonstrate strong performance in controlled medical environments such as multiple choice exams, but their utility in real-world clinical workflows remains unproven. The NHS Advice & Guidance (A&G) service, where Primary Care clinicians can submit text-based queries to specialists, provides an environment for evaluating the clinical performance of LLMs as a specialist. Methods: We compared responses from MedGemma 4B-IT, an open-weight model deployed locally on hospital infrastructure, against specialist neurologist responses across 50 adult neurology A&G cases from University College London Hospital. Two neurologists and two GPs rated 80 blinded and 20 unblinded responses for outcome, safety, efficacy, and feasibility using standardised criteria; outcome was rated as binary (correct/incorrect), while other domains were scored 1-5. Inter-rater reliability was assessed using intraclass correlation coefficients. Results: Although there were no statistically significant differences between blinded specialist neurologist and LLM responses across any domain (outcome: 84% vs 82%, p=0.67; safety: 3.98 vs 4.02, p=0.85; efficacy: 4.06 vs 3.98, p=0.61; feasibility: 4.39 vs 4.20, p=0.45), 10% of LLM responses received concerning scores (average score ≤2) compared to 0% of human responses, indicating potentially clinically important tail risk. Furthermore, unblinded ratings favoured human responses across all domains. Only 51% of binary outcomes had unanimous agreement, and inter-rater agreement was moderate across other domains (ICC 0.50-0.52). Conclusions: In this pilot study, aggregate scores between blinded human and LLM responses were similar, and no statistically significant differences were detected in this exploratory sample. However, aggregate metrics masked clinically important edge-case failures in LLM responses. Pronounced inter-rater variability and the potential impact of LLM/human syntax on blinded rater judgements highlight the challenges in establishing robust evaluation frameworks for clinical LLM deployment.

17
Cognitive Outcomes After Stenting and Endarterectomy: A Systematic-Review and Meta-Analysis

Ertl, W. J. P.; Ward, J.; Twomey, Z. A.; Call-Orellana, F.; Verma, U.; Jen, S. S.; Shakir, H. J.

2026-05-10 surgery 10.64898/2026.05.04.26351899 medRxiv
Top 0.2%
3.1%

Background: Carotid artery stenosis may contribute to cognitive impairment through chronic hypoperfusion and subclinical ischemic injury. Although carotid endarterectomy (CEA) and carotid artery stenting (CAS) reduce stroke risk, their cognitive effects remain unclear. We conducted a systematic review and meta-analysis to evaluate cognitive outcomes after these interventions. Methods: Following PRISMA guidelines, we searched PubMed, Embase, Web of Science, and the Cochrane Library from January 2000 to November 2025. Eligible studies reported cognitive outcomes after CEA or CAS, either alone or in direct comparison. Random-effects meta-analyses were performed for Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA) outcomes. Single-arm studies were analyzed using standardized mean change, and head-to-head studies using mean difference. Outcomes were stratified by intervention type and follow-up interval (0-6 months and >6 months). Domain-specific cognitive outcomes were summarized qualitatively. Risk of bias was assessed using RoB-2, ROBINS-I, and the Newcastle-Ottawa Scale. Results: Sixty-eight studies including 4,659 patients met inclusion criteria; 27 contributed to meta-analysis and 41 to qualitative synthesis. MMSE showed no significant early change after either intervention, while CAS showed significant improvement at >6 months. MoCA improved significantly after both CEA and CAS at early and late follow-up, although heterogeneity was high. Head-to-head analyses found no significant difference between CEA and CAS for MMSE or MoCA, but these comparisons were limited by small sample sizes. Domain-specific outcomes were mostly stable, with improvements most often reported in memory, attention, executive function, and processing speed. Conclusions: Carotid revascularization may be associated with improved cognitive outcomes, particularly on MoCA, but results are heterogeneous and largely observational. Comparative evidence does not show a clear cognitive advantage of CEA or CAS. Future studies should use standardized cognitive testing and adequately powered direct comparisons.

18
Comparative Efficacy and Safety of Calcitonin Gene-Related Peptide Monoclonal Antibodies Versus Oral Gepants for Episodic Migraine Prevention: A Bayesian Network Meta-Analysis of Randomized Controlled Trials

Kakde, S. P.; Arora, N.; Kakde, M. P.; Kakade, S. P.

2026-05-24 neurology 10.64898/2026.05.18.26352539 medRxiv
Top 0.2%
3.1%

Background. Calcitonin gene-related peptide (CGRP)-targeted therapies, including injectable monoclonal antibodies (mAbs: erenumab, fremanezumab, galcanezumab, eptinezumab) and oral gepants (atogepant, rimegepant), represent a paradigm shift in episodic migraine prevention. No direct head-to-head trials across the full drug class exist. We conducted a PRISMA-NMA-compliant Bayesian network meta-analysis (NMA) to compare the relative efficacy and tolerability of all approved CGRP-targeted preventive therapies. Methods. PubMed, Embase, and Cochrane CENTRAL (inception to January 2026) were searched for double-blind RCTs in episodic migraine. A Bayesian random-effects NMA used Markov Chain Monte Carlo simulation. Primary outcome: change in monthly migraine days (MMD). Secondary outcomes: 50% or greater responder rate, TEAEs, and DAEs. SUCRA probabilities quantified treatment rankings. Transitivity was formally assessed. Publication bias was evaluated using comparison-adjusted funnel plots and Egger's test. GRADE certainty was rated for all key comparisons. Results. Thirty-two RCTs (24,418 participants; mean age 39.2 years; 84% female; mean baseline 8.2 MMD) were included (Table 1). All active treatments significantly reduced MMD versus placebo. Eptinezumab 300 mg ranked highest for MMD reduction (MD 2.40 MMD, 95% CrI 3.10 to 1.70; SUCRA 91.2%), followed by galcanezumab 240 mg (SUCRA 85.4%) and erenumab 140 mg (SUCRA 79.8%). For the 50% responder rate, galcanezumab 240 mg ranked highest (OR 3.12, 95% CrI 2.22 to 4.38; SUCRA 92.1%). Oral gepants demonstrated significant but more modest efficacy: atogepant 60 mg (SUCRA 38.4%) and rimegepant (SUCRA 28.9%). The absolute mAb-versus-gepant efficacy difference of approximately 1.1 MMD exceeded the accepted minimal clinically important difference. Gepants demonstrated placebo-comparable tolerability (TEAE RR 1.02, 95% CrI 0.93 to 1.12; SUCRA 93 to 96%). Heterogeneity was low to moderate (I-squared 14 to 31%), and there was no significant network inconsistency (node-split p greater than 0.29) or publication bias (Egger's test p = 0.24). GRADE certainty was high for class-versus-placebo comparisons and moderate for indirect mAb-versus-gepant comparisons. Conclusion. CGRP mAbs provide superior efficacy over oral gepants for episodic migraine prevention. Oral gepants offer placebo-comparable tolerability. An individualized, patient-centered approach guided by symptom burden, comorbidities, administration preference, and the efficacy-tolerability tradeoff of each drug class is recommended.

19
Individualized Forecasting of Headache Attack Risk Using a Continuously Updating Model

Houle, T. T.; Lebowitz, A.; Chtay, I.; Patel, T.; McGeary, D. D.; Turner, D. P.

2026-04-22 neurology 10.64898/2026.04.20.26350119 medRxiv
Top 0.2%
3.0%

Importance: Migraine attacks often occur unpredictably, limiting the ability of individuals to initiate timely preventive or preemptive treatment. Short-term probabilistic forecasting of migraine risk could enable more targeted management strategies. Objective: To externally validate the previously developed Headache Prediction Model (HAPRED-I), evaluate an updated continuously learning model (HAPRED-II), and assess the feasibility and short-term safety of delivering individualized probabilistic migraine forecasts directly to patients. Design, Setting, and Participants: Prospective 8-week cohort study conducted remotely at two academic medical centers in the United States (Massachusetts General Hospital and Wake Forest Health Sciences) between 2015 and 2019. Adults with recurrent migraine or tension-type headache completed twice-daily electronic diaries. A total of 230 participants contributed 23,335 diary entries across 11,862 participant-days of observation. Main Outcomes and Measures: Occurrence of a headache attack within 24 hours following each evening diary entry. Model performance was evaluated using discrimination (area under the receiver operating characteristic curve [AUC]) and calibration. Results: External validation of HAPRED-I demonstrated modest discrimination (AUC, 0.59; 95% CI, 0.57-0.61) and poor calibration, with predicted probabilities consistently exceeding observed headache risk. In contrast, the continuously updating HAPRED-II model demonstrated progressive improvement in predictive performance as participant-specific data accumulated. Discrimination increased from an AUC of 0.59 (95% CI, 0.57-0.61) during the first 14 days to 0.66 (95% CI, 0.63-0.70) after the first month, accompanied by improved calibration across predicted risk levels. Over the study period, 6999 individualized forecasts were delivered directly to participants. No evidence suggested that receipt of forecasts was associated with increasing headache frequency or worsening predicted headache risk trajectories. Conclusions and Relevance: A static migraine forecasting model demonstrated limited transportability to new individuals. In contrast, models that continuously update within individuals may improve predictive accuracy over time and enable real-time delivery of personalized migraine risk forecasts. Further work incorporating richer physiologic and contextual predictors will likely be necessary before such systems can reliably guide clinical treatment decisions.

20
A diagnostic model based on differential whole-brain dynamics for distinguishing neuropsychiatric symptom and cognitive impairment

Huang, L.; Yan, M.; Deng, Z.; Lv, Y.; Yu, W.

2026-04-28 neurology 10.64898/2026.04.27.26351804 medRxiv
Top 0.2%
2.7%

Objectives: Neuropsychiatric symptoms (NPS) are prevalent in individuals with cognitive impairment (CI). However, the similarities and differences in whole-brain dynamics between individuals with CI and those with NPS remain controversial. Electroencephalography (EEG) microstates reflect whole-brain dynamics. This study aimed to investigate the EEG microstate parameters that differ between CI and NPS and to construct a related diagnostic model. Methods/design: This was a cross-sectional study. Clinical and EEG data were collected, and an EEG microstate analysis was performed. A least absolute shrinkage and selection operator (LASSO) regression model was used to identify EEG microstate parameters that differed significantly between CI and NPS and to construct a diagnostic model. Model performance was tested using the receiver operating characteristic (ROC) curve. Results: This study enrolled 78 participants. A total of 36 EEG microstate parameters were identified and included in the differential analysis. In the LASSO regression model, 4 significantly different EEG microstate parameters were selected: the duration of class C, TPAB, TPBA, and TPDC. The ROC results showed that the diagnostic model for distinguishing NPS patients from CI patients achieved an area under the curve (AUC) of 0.905 (95% CI: 0.784-1.000), with a sensitivity of 100.0% and a specificity of 76.9%. Conclusions: The diagnostic model based on EEG microstate parameters showed good performance for differentiating NPS patients from CI patients.